CHAPTER 1 Biostatistics 101

Comparing groups

In Part 4, we show you different ways to compare groups statistically.

» In Chapter 11, you see how to compare average values between two or
more groups by using t tests and ANOVAs. We also describe their
nonparametric counterparts that can be used with skewed or other
non-normally distributed data.

» Chapter 12 shows how to compare proportions between two or more groups,
such as the proportions of patients responding to two different drugs, using
the chi-square and Fisher Exact tests on cross-tabulated (cross-tab) data.

» Chapter 13 focuses on one specific kind of cross-tab called the fourfold table,
which has exactly two rows and two columns. Because the fourfold table
provides the opportunity for some particularly insightful calculations, it’s
worth a chapter of its own.

» In Chapter 14, you discover how the terminology used in epidemiologic
studies is applied to specifically formatted fourfold tables to calculate
incidence and prevalence rates.
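
To make the fourfold table a little more concrete, here's a minimal Python sketch of the kinds of calculations that a table with exactly two rows and two columns allows. The function name fourfold_summary, the drug labels, and the counts are all made up for illustration and don't come from any study. The chi-square value is the plain Pearson statistic without a continuity correction, which is only one of several ways to test a fourfold table.

```python
def fourfold_summary(a, b, c, d):
    """Summarize a 2x2 (fourfold) table laid out as [[a, b], [c, d]].

    Each row is a group. The first column counts subjects who had the
    outcome; the second column counts subjects who did not.
    This sketch assumes no cell or margin is zero.
    """
    n = a + b + c + d
    p_group1 = a / (a + b)  # proportion with the outcome in row 1
    p_group2 = c / (c + d)  # proportion with the outcome in row 2
    return {
        "p_group1": p_group1,
        "p_group2": p_group2,
        "risk_ratio": p_group1 / p_group2,
        "odds_ratio": (a * d) / (b * c),
        # Pearson chi-square for a 2x2 table, no continuity correction
        "chi_square": n * (a * d - b * c) ** 2
        / ((a + b) * (c + d) * (a + c) * (b + d)),
    }


# Hypothetical counts, invented for this example:
#            responded   did not respond
# Drug A        30             20
# Drug B        15             35
summary = fourfold_summary(30, 20, 15, 35)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In real work you would normally let a statistics package do this, because it also reports p values and confidence intervals, which this sketch leaves out.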

Looking for relationships between variables

Epidemiology and biostatistics are both concerned with causal inference, which
means trying to figure out what causes particular outcomes in biological
research. While you can look at the relationship between two variables in a
bivariate analysis, regression analysis is the part of statistics that enables
you to explore the relationship between multiple variables and one outcome in
the same model, so you can evaluate each variable's relative contribution to
the outcome. Here are some use cases for regression:

» You may want to know whether there’s a statistically significant association
between one or more variables and an outcome, even after accounting for the
other variables in the model. You may ask: Does being overweight increase the
likelihood of getting liver cancer? Or: Is exercising fewer hours per week
associated with higher blood pressure measurements? In answering both
of those questions, you may want to control for other variables known to
influence the outcome.

» You may want to develop a formula for predicting the value of a variable from
the observed values of one or more other variables. For example, you may
want to predict how long a newly diagnosed cancer patient may survive
based on their age, obesity status, and medical history.
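
As a rough illustration of the second use case, here's a minimal Python sketch that fits a straight line by ordinary least squares and uses it as a prediction formula. It handles only one predictor, so it doesn't control for any other variables the way a multiple regression model would. The data values and the names fit_line and predict_bp are hypothetical, invented to echo the exercise and blood pressure question above.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Simple (one-predictor) linear regression only; adjusting for other
    variables would require a multiple regression model instead.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x, and sum of cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, invented for this example:
# weekly exercise hours and systolic blood pressure (mmHg) for six people
exercise_hours = [0, 1, 2, 3, 4, 5]
systolic_bp = [140, 138, 133, 131, 127, 125]

intercept, slope = fit_line(exercise_hours, systolic_bp)


def predict_bp(hours):
    """Predict systolic blood pressure from weekly exercise hours."""
    return intercept + slope * hours


print(f"slope: {slope:.2f} mmHg per weekly exercise hour")
print(f"predicted BP at 2.5 hours per week: {predict_bp(2.5):.1f} mmHg")
```

A negative slope here means only that, in these made-up numbers, more exercise goes along with lower blood pressure. By itself it says nothing about statistical significance or about cause.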